Replace JCTools queues with VarHandle-based implementations for Java 9+ #9896
base: master
🎯 Code Coverage 🔗 Commit SHA: b2850b3
Debugger benchmarks
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 5 unstable metrics.
Request duration reports for reports
reports - request duration [CI 0.99]: candidate=None, baseline=None
| Scenario | Baseline | Baseline CI 0.99 (µs) | Candidate | Candidate CI 0.99 (µs) |
|---|---|---|---|---|
| noprobe | 317.508 µs | 291 to 344 | 319.585 µs | 290 to 349 |
| basic | 294.053 µs | 287 to 301 | 293.196 µs | 286 to 300 |
| loop | 8.959 ms | 8956 to 8963 | 8.955 ms | 8952 to 8958 |
Benchmarks
Startup
Summary: Found 4 performance improvements and 2 performance regressions! Performance is the same for 50 metrics, 9 unstable metrics.
Startup time reports for petclinic
petclinic - global startup overhead: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Section | Metric | Baseline | Candidate |
|---|---|---|---|
| tracing | Agent | 1.059 s | 1.043 s |
| tracing | Total | 10.867 s | 4.277 s |
| appsec | Agent | 1.223 s | 1.224 s |
| appsec | Total | 10.887 s | 10.929 s |
| iast | Agent | 1.179 s | 1.181 s |
| iast | Total | 11.166 s | 11.235 s |
| profiling | Agent | 1.203 s | 1.194 s |
| profiling | Total | 10.852 s | 10.891 s |
petclinic - break down per module: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Section | Module | Baseline | Candidate |
|---|---|---|---|
| tracing | crashtracking | 1.48 ms | 1.445 ms |
| tracing | BytebuddyAgent | 713.857 ms | 705.626 ms |
| tracing | GlobalTracer | 247.826 ms | 239.963 ms |
| tracing | AppSec | 32.587 ms | 32.789 ms |
| tracing | Debugger | 6.466 ms | 6.327 ms |
| tracing | Remote Config | 712.572 µs | 697.274 µs |
| tracing | Telemetry | 15.081 ms | 9.418 ms |
| tracing | Flare Poller | 5.767 ms | 12.402 ms |
| appsec | crashtracking | 1.461 ms | 1.46 ms |
| appsec | BytebuddyAgent | 729.65 ms | 733.206 ms |
| appsec | GlobalTracer | 238.051 ms | 233.694 ms |
| appsec | IAST | 24.728 ms | 25.06 ms |
| appsec | AppSec | 174.967 ms | 175.181 ms |
| appsec | Debugger | 5.982 ms | 6.191 ms |
| appsec | Remote Config | 655.024 µs | 672.866 µs |
| appsec | Telemetry | 8.541 ms | 8.897 ms |
| appsec | Flare Poller | 3.986 ms | 4.121 ms |
| iast | crashtracking | 1.454 ms | 1.469 ms |
| iast | BytebuddyAgent | 827.554 ms | 832.83 ms |
| iast | GlobalTracer | 234.764 ms | 230.839 ms |
| iast | IAST | 31.572 ms | 33.789 ms |
| iast | AppSec | 29.934 ms | 28.3 ms |
| iast | Debugger | 5.991 ms | 6.02 ms |
| iast | Remote Config | 591.414 µs | 595.394 µs |
| iast | Telemetry | 8.46 ms | 8.351 ms |
| iast | Flare Poller | 4.162 ms | 4.139 ms |
| profiling | crashtracking | 1.458 ms | 1.463 ms |
| profiling | BytebuddyAgent | 735.903 ms | 733.894 ms |
| profiling | GlobalTracer | 223.769 ms | 218.058 ms |
| profiling | AppSec | 32.439 ms | 32.22 ms |
| profiling | Debugger | 7.601 ms | 6.5 ms |
| profiling | Remote Config | 1.407 ms | 683.443 µs |
| profiling | Telemetry | 14.601 ms | 16.252 ms |
| profiling | Flare Poller | 4.171 ms | 4.118 ms |
| profiling | ProfilingAgent | 111.709 ms | 110.834 ms |
| profiling | Profiling | 112.374 ms | 111.469 ms |
Startup time reports for insecure-bank
insecure-bank - global startup overhead: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Section | Metric | Baseline | Candidate |
|---|---|---|---|
| tracing | Agent | 1.049 s | 1.041 s |
| tracing | Total | 8.664 s | 8.657 s |
| iast | Agent | 1.181 s | 1.182 s |
| iast | Total | 9.278 s | 9.277 s |
insecure-bank - break down per module: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Section | Module | Baseline | Candidate |
|---|---|---|---|
| tracing | crashtracking | 1.46 ms | 1.459 ms |
| tracing | BytebuddyAgent | 706.057 ms | 704.649 ms |
| tracing | GlobalTracer | 246.218 ms | 239.889 ms |
| tracing | AppSec | 32.536 ms | 32.545 ms |
| tracing | Debugger | 6.439 ms | 6.323 ms |
| tracing | Remote Config | 720.415 µs | 690.347 µs |
| tracing | Telemetry | 13.697 ms | 10.922 ms |
| tracing | Flare Poller | 7.228 ms | 10.171 ms |
| iast | crashtracking | 1.465 ms | 1.502 ms |
| iast | BytebuddyAgent | 829.268 ms | 836.861 ms |
| iast | GlobalTracer | 235.197 ms | 228.18 ms |
| iast | AppSec | 27.897 ms | 29.007 ms |
| iast | Debugger | 6.083 ms | 5.965 ms |
| iast | Remote Config | 606.581 µs | 580.017 µs |
| iast | Telemetry | 8.469 ms | 8.407 ms |
| iast | Flare Poller | 4.14 ms | 4.057 ms |
| iast | IAST | 33.369 ms | 32.571 ms |
Load
Summary: Found 4 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 16 unstable metrics.
Request duration reports for insecure-bank
insecure-bank - request duration [CI 0.99]: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Scenario | Baseline | Baseline CI 0.99 (µs) | Candidate | Candidate CI 0.99 (µs) |
|---|---|---|---|---|
| no_agent | 1.179 ms | 1167 to 1190 | 1.2 ms | 1188 to 1212 |
| iast | 3.198 ms | 3158 to 3238 | 3.071 ms | 3030 to 3112 |
| iast_FULL | 5.595 ms | 5539 to 5650 | 5.672 ms | 5615 to 5730 |
| iast_GLOBAL | 3.649 ms | 3588 to 3710 | 3.507 ms | 3456 to 3559 |
| profiling | 2.091 ms | 2071 to 2110 | 2.136 ms | 2117 to 2155 |
| tracing | 1.807 ms | 1792 to 1823 | 1.761 ms | 1748 to 1775 |
Request duration reports for petclinic
petclinic - request duration [CI 0.99]: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Scenario | Baseline | Baseline CI 0.99 (µs) | Candidate | Candidate CI 0.99 (µs) |
|---|---|---|---|---|
| no_agent | 17.252 ms | 17077 to 17427 | 17.34 ms | 17165 to 17515 |
| appsec | 18.558 ms | 18371 to 18744 | 18.474 ms | 18286 to 18662 |
| code_origins | 17.479 ms | 17302 to 17656 | 17.629 ms | 17455 to 17802 |
| iast | 17.4 ms | 17228 to 17573 | 17.652 ms | 17477 to 17828 |
| profiling | 19.953 ms | 19750 to 20156 | 18.551 ms | 18364 to 18739 |
| tracing | 17.764 ms | 17587 to 17941 | 17.49 ms | 17318 to 17663 |
Dacapo
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for tomcat
tomcat - execution time [CI 0.99]: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Scenario | Baseline | Baseline CI 0.99 (µs) | Candidate | Candidate CI 0.99 (µs) |
|---|---|---|---|---|
| no_agent | 1.479 ms | 1467 to 1491 | 1.478 ms | 1467 to 1490 |
| appsec | 3.636 ms | 3421 to 3851 | 3.691 ms | 3470 to 3911 |
| iast | 2.215 ms | 2151 to 2279 | 2.213 ms | 2149 to 2276 |
| iast_GLOBAL | 2.252 ms | 2188 to 2316 | 2.254 ms | 2190 to 2318 |
| profiling | 2.076 ms | 2023 to 2129 | 2.05 ms | 1999 to 2101 |
| tracing | 2.029 ms | 1980 to 2079 | 2.031 ms | 1981 to 2080 |
Execution time for biojava
biojava - execution time [CI 0.99]: candidate=1.56.0-SNAPSHOT~b2850b3848, baseline=1.56.0-SNAPSHOT~5db793a092
| Scenario | Baseline | Candidate |
|---|---|---|
| no_agent | 15.543 s | 15.515 s |
| appsec | 15.361 s | 14.953 s |
| iast | 18.788 s | 18.588 s |
| iast_GLOBAL | 18.011 s | 17.77 s |
| profiling | 14.931 s | 14.743 s |
| tracing | 14.745 s | 14.817 s |
229f67a to 374d13d
This reverts commit 14cc597.
2721d41 to 0a72587
21e0a65 to 259eeb5
9e7acbe to b2850b3
Hi @amarziali, I am one of the developers of JCTools, and we would be super happy if we could bring a VarHandle generation variant into our lib as well. Note: JCTools is at the very core of other frameworks which will soon hit the "no unsafe world" JVM barrier, including Netty.
    // Padding to avoid false sharing
    @SuppressWarnings("unused")
    private long p0, p1, p2, p3, p4, p5, p6;
FYI the JVM is free to reorder fields, so to guarantee this padding you need to use an artificial class hierarchy that stops all the padding from being re-ordered to the end.
See https://sanjeev.pages.dev/false-sharing-cache-line-padding/#manualexplicit-padding-in-java
and the padded hierarchy in https://github.com/JCTools/JCTools/blob/master/jctools-core/src/main/java/org/jctools/queues/MpscBlockingConsumerArrayQueue.java
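A minimal sketch of the class-hierarchy padding described in this comment. All class and field names here are hypothetical and are not taken from the PR or from JCTools. It relies on HotSpot's usual behavior of laying out superclass fields before subclass fields, which the language specification does not guarantee, and it assumes 64-byte cache lines.

```java
// Hypothetical names throughout; this only illustrates the layering idea.
// Each hot field sits in its own class, with a class of padding on either side,
// so the padding cannot be grouped together away from the field it protects.

abstract class PadBeforeHead {
    @SuppressWarnings("unused")
    long p00, p01, p02, p03, p04, p05, p06; // padding ahead of head
}

abstract class HeadField extends PadBeforeHead {
    volatile long head; // consumer index
}

abstract class PadBetween extends HeadField {
    @SuppressWarnings("unused")
    long p10, p11, p12, p13, p14, p15, p16; // 56 bytes; with head's 8 bytes that spans 64
}

abstract class TailField extends PadBetween {
    volatile long tail; // producer index
}

final class PaddedIndices extends TailField {
    @SuppressWarnings("unused")
    long p20, p21, p22, p23, p24, p25, p26; // padding after tail
}
```

A tool such as JOL can be used to inspect the actual field offsets on a given JVM, since the resulting layout is implementation-specific.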
What Does This Do
This PR introduces a set of queue implementations to replace the JCTools-based queues, eliminating direct usage of sun.misc.Unsafe and providing full compatibility with Java 9+ runtimes through the VarHandle API.
The goal is to achieve high-performance concurrent queue behavior similar to JCTools while using supported, standard Java mechanisms.
A new Queues factory class is introduced to dynamically select the optimal queue implementation based on the Java runtime environment:
Introduced Classes Summary
- SpscArrayQueueVarHandle
- SpmcArrayQueueVarHandle: [...] consumerLimit caching to reduce volatile contention.
- MpscArrayQueueVarHandle<E>: [...] TAIL_HANDLE. Maintains a producerLimit to minimize volatile head reads.
- MpscBlockingConsumerArrayQueueVarHandle<E>: [...] CONSUMER_THREAD_HANDLE to park/unpark the waiting consumer efficiently.
Memory Padding
All queue state fields (head, tail, cached limits, etc.) are cache-line padded to prevent false sharing between producers and consumers. This ensures that frequently accessed hot fields do not reside on the same cache line across threads, minimizing cache invalidations and improving throughput under contention.
Memory Fence Semantics
Memory fences were explicitly chosen for each access type to minimize volatile overhead while maintaining correct visibility guarantees:
- setRelease/getAcquire for publishing and consuming elements: provides correct inter-thread ordering without full barriers.
- setOpaque/getOpaque for relaxed head/tail updates: avoids unnecessary synchronization costs where ordering is not required.
- getVolatile only used when full memory fences are really required (e.g. refreshing limits to ensure visibility when the queue might be full or empty).
Queue Benchmark Results (ops/us)
Note: The SPSC benchmark shows contention on the slow path (i.e. when the queue is full or empty). This should happen less frequently in our case. Increasing the queue size (hence reducing the probability that it is full) shows good performance.
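To make the fast path and slow path in the note above more concrete, here is a simplified single-producer/single-consumer ring buffer. It is a sketch under stated assumptions, not the PR's SpscArrayQueueVarHandle: the class name SpscSketch and its fields are hypothetical, cache-line padding is omitted, and release/acquire is applied to the indices rather than to the element slots. The fast path only touches a locally cached limit; the slow path refreshes that limit with getAcquire when the queue looks full or empty.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical SPSC ring buffer; illustrative only, padding omitted. */
final class SpscSketch<E> {
    private static final VarHandle HEAD;
    private static final VarHandle TAIL;

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            HEAD = lookup.findVarHandle(SpscSketch.class, "head", long.class);
            TAIL = lookup.findVarHandle(SpscSketch.class, "tail", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final Object[] buffer;
    private final int mask;
    private long head;       // written only by the consumer
    private long tail;       // written only by the producer
    private long cachedHead; // producer's possibly stale view of head
    private long cachedTail; // consumer's possibly stale view of tail

    SpscSketch(int capacity) {
        if (Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        buffer = new Object[capacity];
        mask = capacity - 1;
    }

    /** Producer thread only. Returns false when the queue is full. */
    boolean offer(E e) {
        long t = tail; // plain read: this thread is the only writer of tail
        if (t - cachedHead >= buffer.length) {
            // Slow path: the cached view says "full", so refresh it.
            cachedHead = (long) HEAD.getAcquire(this);
            if (t - cachedHead >= buffer.length) {
                return false;
            }
        }
        buffer[(int) (t & mask)] = e;
        TAIL.setRelease(this, t + 1); // publishes the element write above
        return true;
    }

    /** Consumer thread only. Returns null when the queue is empty. */
    @SuppressWarnings("unchecked")
    E poll() {
        long h = head; // plain read: this thread is the only writer of head
        if (h >= cachedTail) {
            // Slow path: the cached view says "empty", so refresh it.
            cachedTail = (long) TAIL.getAcquire(this);
            if (h >= cachedTail) {
                return null;
            }
        }
        int i = (int) (h & mask);
        E e = (E) buffer[i];
        buffer[i] = null;
        HEAD.setRelease(this, h + 1); // hands the slot back to the producer
        return e;
    }
}
```

With a larger capacity the cached limits stay valid for longer, so fewer calls fall through to the refresh, which is consistent with the observation in the note.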
MPSCBlockingConsumer Queue Benchmark (ops/us)
MPSC Queue Benchmark (ops/us)
SPSC Queue Benchmark (ops/us)
Takeaways:
Room for future improvements
In high-throughput scenarios where multiple producers compete for queue space, contention on the CAS operation can become a bottleneck.
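The contention point can be illustrated with a minimal CAS-based slot claim. The class CasSlotClaim and its fields are hypothetical and only model the tail index, not a full queue: every producer reads the tail and tries to advance it by one, and any producer that loses the race has to re-read and retry.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical model of producers claiming slots through a CAS loop on the tail index. */
final class CasSlotClaim {
    private static final VarHandle TAIL;

    static {
        try {
            TAIL = MethodHandles.lookup()
                    .findVarHandle(CasSlotClaim.class, "tail", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @SuppressWarnings("unused")
    private volatile long tail; // accessed through TAIL
    private final long limit;   // first index that may NOT be claimed

    CasSlotClaim(long limit) {
        this.limit = limit;
    }

    /** Returns the claimed slot index, or -1 when no slot is left. */
    long claim() {
        while (true) {
            long t = (long) TAIL.getVolatile(this);
            if (t >= limit) {
                return -1; // full: nothing to claim
            }
            if (TAIL.compareAndSet(this, t, t + 1)) {
                return t; // this thread owns slot t
            }
            // Another producer advanced tail first; loop and retry.
        }
    }
}
```

Under heavy contention most producers spend their time in the retry branch, since only one compareAndSet per tail value can succeed.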
One idea to mitigate this: when the queue is likely not full, a getAndAdd operation can be used instead of a CAS to claim slots, since it never fails. This optimization allows multiple producers to advance the tail index with reduced atomic contention. However, when the queue is nearly full, getAndAdd cannot be done reliably, so the classic CAS loop (slow path) can be used instead.
Motivation
Additional Notes